131 research outputs found

    A Probabilistic Approach for Spatio-Temporal Phase Unwrapping in Multi-Frequency Phase-Shift Coding

    Multi-frequency techniques with temporally encoded pattern sequences are used in phase-measuring methods of 3D optical metrology to suppress phase noise, but they lead to ambiguities that can only be resolved by phase unwrapping. However, classical phase unwrapping methods do not use all the available information to unwrap all measurements simultaneously, and they do not consider the periodicity of the phase, which can lead to errors. We present an approach that optimally reconstructs the phase on a pixel-by-pixel basis using a probabilistic modeling approach. The individual phase measurements are modeled using circular probability densities. Maximizing the compound density of all measurements yields the optimal decoding. Since the entire information of all phase measurements is used simultaneously and the wrapping of the phases is implicitly compensated, the reliability can be greatly increased. In addition, a spatio-temporal phase unwrapping is introduced by probabilistic modeling of the local pixel neighborhoods. This leads to even higher robustness against noise than conventional methods and thus to better measurement results.
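As a rough illustration of the idea of a compound circular likelihood, the following sketch decodes a single pixel's coordinate from wrapped multi-frequency phase measurements by maximizing a sum of von Mises log-densities. The frequencies, the concentration parameter, and the grid search are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical pattern frequencies (periods over the unit measurement range)
# and a concentration parameter modeling the phase-noise level.
freqs = np.array([1.0, 4.0, 15.0])
kappa = 200.0

def decode(wrapped_phases, n_grid=20000):
    """Return x in [0, 1) maximizing sum_i kappa * cos(2*pi*f_i*x - phi_i).

    Because cos() is 2*pi-periodic, the wrapping of each phase measurement
    is compensated implicitly: no explicit fringe-order estimation is needed,
    and all measurements contribute to the decoding simultaneously.
    """
    x = np.linspace(0.0, 1.0, n_grid, endpoint=False)
    # compound von Mises log-likelihood, summed over all frequencies
    ll = (kappa * np.cos(2 * np.pi * freqs[:, None] * x[None, :]
                         - wrapped_phases[:, None])).sum(axis=0)
    return x[np.argmax(ll)]

# simulate one pixel: a true coordinate and noisy, wrapped phase measurements
rng = np.random.default_rng(0)
x_true = 0.637
phases = np.angle(np.exp(1j * (2 * np.pi * freqs * x_true
                               + rng.normal(0.0, 0.05, freqs.size))))
x_hat = decode(phases)
```

The low frequency resolves the coarse position while the high frequency sharpens the likelihood peak, which is why the joint maximization is more robust than unwrapping each measurement in sequence.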

    A Multispectral Light Field Dataset and Framework for Light Field Deep Learning

    Deep learning has undoubtedly had a huge impact on the computer vision community in recent years. In light field imaging, machine learning-based applications have significantly outperformed their conventional counterparts. Furthermore, multi- and hyperspectral light fields have shown promising results in light field-related applications such as disparity or shape estimation. Yet a multispectral light field dataset enabling data-driven approaches is missing. Therefore, we propose a new synthetic multispectral light field dataset with depth and disparity ground truth. The dataset consists of a training, validation and test dataset containing light fields of randomly generated scenes, as well as a challenge dataset rendered from hand-crafted scenes enabling detailed performance assessment. Additionally, we present a Python framework for light field deep learning. The goal of this framework is to ensure reproducibility of light field deep learning research and to provide a unified platform to accelerate the development of new architectures. The dataset is made available under dx.doi.org/10.21227/y90t-xk47. The framework is maintained at gitlab.com/iiit-public/lfcnn.

    Improved Separation of Polyphonic Chamber Music Signals by Integrating Instrument Activity Labels

    The separation of music signals is a very challenging task, especially in the case of polyphonic chamber music, because of the similar frequency ranges and sound characteristics of the instruments to be separated. In this work, a joint separation approach in the time domain with a U-Net architecture is extended to incorporate additional time-dependent instrument activity information for improved instrument track extraction. Different stages for integrating the additional information are investigated; an input before the deepest encoder block achieves the best separation results as well as the highest robustness against randomly wrong labels. This approach outperforms label integration by multiplication and the input of a static instrument label. Targeted data augmentation with incoherent mixtures is used for a trio example of violin, trumpet, and flute to improve separation results. Moreover, an alternative separation approach with one independent separation model per instrument is investigated, which enables a more flexible architecture. In this case, an input after the deepest encoder block achieves the best separation results, but the robustness is slightly reduced compared to the joint model. The improvements from additional information on active instruments are verified by using real instrument activity predictions for both the joint and the independent separation approaches.
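At a purely shape-level view, the two label-integration variants compared above can be sketched as follows. All dimensions, the downsampling scheme, and the gating layout are illustrative assumptions, not the paper's network.

```python
import numpy as np

# Assumed bottleneck dimensions of a time-domain U-Net encoder.
C, T = 256, 4096             # feature channels and time steps at the bottleneck
n_instruments = 3            # e.g. violin, trumpet, flute

features = np.random.randn(C, T)                          # deepest encoder input
# binary, time-dependent instrument activity labels at the frame rate
activity = (np.random.rand(n_instruments, T * 8) > 0.5).astype(float)

# downsample the frame-level labels to the bottleneck time resolution
stride = activity.shape[1] // T
activity_ds = activity[:, ::stride]                       # (n_instruments, T)

# concatenation variant (reported best for the joint model):
# the labels enter as extra feature channels before the deepest block
conditioned = np.concatenate([features, activity_ds], axis=0)  # (C + 3, T)

# multiplication variant (reported weaker): activity gates feature channels,
# here illustrated by zeroing one channel per inactive instrument
gated = features[:n_instruments] * activity_ds            # (3, T)
```

Concatenation lets the network learn how to use the labels, whereas multiplication hard-gates the features, which makes the model more sensitive to wrong labels.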

    Light Field Reconstruction using a Generic Imaging Model


    Forum Bildverarbeitung 2022

    The field of image processing links the sensor technology of cameras – imaging sensors – with the processing of the sensor data – the images. This combination is what gives the discipline its particular appeal. These proceedings of the "Forum Bildverarbeitung", held on 24 and 25 November 2022 in Karlsruhe as an event of the Karlsruhe Institute of Technology and the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation, contain the papers of the submitted contributions.

    Influence of input data representations for time-dependent instrument recognition

    An important preprocessing step for various music signal processing algorithms is estimating which instruments are playing in a music recording. To this end, time-dependent instrument recognition is realized here by a neural network with residual blocks. Since music signal processing tasks use different time-frequency representations as input matrices, this work analyzes the influence of different input representations on instrument recognition. Both three-dimensional inputs consisting of the short-time Fourier transform (STFT) with an additional phase-based time-frequency representation and the magnitudes of the two-dimensional STFT or the constant-Q transform (CQT) are investigated. As additional phase representations, the product spectrum (PS), based on the modified group delay, and the frequency error matrix (FE matrix), derived from the instantaneous frequency, are used. Training and evaluation are carried out on the MusicNet dataset, which allows the estimation of seven instruments. A higher number of frequency bins in the input representations improves instrument recognition by about 2 % in F1-score. Compared to the literature, frame-level instrument recognition is improved for several input representations.
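To make the phase-based FE matrix more concrete, the following sketch computes an STFT magnitude and a frequency-error matrix from the instantaneous frequency, estimated via the frame-to-frame phase increment. The window, hop size, and the exact FE definition are assumptions for illustration, not the paper's settings.

```python
import numpy as np

# Assumed analysis parameters.
fs, n_fft, hop = 16000, 1024, 256

def stft(x):
    """Hann-windowed STFT, one row per frame."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft, hop)]
    return np.fft.rfft(np.array(frames), axis=1)       # (frames, bins)

t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440.0 * t)                      # 1 s test tone at 440 Hz

X = stft(x)
mag = np.abs(X)                                        # magnitude representation

# instantaneous frequency from the wrapped frame-to-frame phase increment
dphi = np.angle(X[1:] * np.conj(X[:-1]))               # measured increment
bin_f = np.fft.rfftfreq(n_fft, 1 / fs)                 # bin center frequencies
expected = 2 * np.pi * bin_f * hop / fs                # increment expected per bin
inst_f = bin_f + np.angle(np.exp(1j * (dphi - expected))) * fs / (2 * np.pi * hop)

# FE matrix: deviation of the instantaneous frequency from the bin center;
# near a tonal component it reveals the component's true frequency
fe = inst_f - bin_f
```

For the 440 Hz tone, the dominant bin center lies at 437.5 Hz (bin width fs/n_fft = 15.625 Hz), so the FE matrix there is about +2.5 Hz, recovering the true frequency between bin centers.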